本文提出了一种新颖的方法,用于在具有复杂拓扑结构的地下领域的搜索和救援行动中自动合作。作为CTU-Cras-Norlab团队的一部分,拟议的系统在DARPA SubT决赛的虚拟轨道中排名第二。与专门为虚拟轨道开发的获奖解决方案相反,该建议的解决方案也被证明是在现实世界竞争极为严峻和狭窄的环境中飞行的机上实体无人机的强大系统。提出的方法可以使无缝模拟转移的无人机团队完全自主和分散的部署,并证明了其优于不同环境可飞行空间的移动UGV团队的优势。该论文的主要贡献存在于映射和导航管道中。映射方法采用新颖的地图表示形式 - 用于有效的风险意识长距离计划,面向覆盖范围和压缩的拓扑范围的LTVMAP领域,以允许在低频道通信下进行多机器人合作。这些表示形式与新的方法一起在导航中使用,以在一般的3D环境中可见性受限的知情搜索,而对环境结构没有任何假设,同时将深度探索与传感器覆盖的剥削保持平衡。所提出的解决方案还包括一条视觉感知管道,用于在没有专用GPU的情况下在5 Hz处进行四个RGB流中感兴趣的对象的板上检测和定位。除了参与DARPA SubT外,在定性和定量评估的各种环境中,在不同的环境中进行了广泛的实验验证,UAV系统的性能得到了支持。
translated by 谷歌翻译
我们提出了通过现实的模拟和现实世界实验来支持可复制研究的多运动无人机控制(UAV)和估计系统。我们提出了一个独特的多帧本地化范式,用于同时使用多个传感器同时估算各种参考框架中的无人机状态。该系统可以在GNSS和GNSS贬低的环境中进行复杂的任务,包括室外室内过渡和执行冗余估计器,以备份不可靠的本地化源。提出了两种反馈控制设计:一个用于精确和激进的操作,另一个用于稳定和平稳的飞行,并进行嘈杂的状态估计。拟议的控制和估计管道是在3D中使用Euler/Tait-Bryan角度表示的,而无需使用Euler/Tait-Bryan角度表示。取而代之的是,我们依靠旋转矩阵和一个新颖的基于标题的惯例来代表标准多电流直升机3D中的一个自由旋转自由度。我们提供了积极维护且有据可查的开源实现,包括对无人机,传感器和本地化系统的现实模拟。拟议的系统是多年应用系统,空中群,空中操纵,运动计划和遥感的多年研究产物。我们所有的结果都得到了现实世界中的部署的支持,该系统部署将系统塑造成此处介绍的表单。此外,该系统是在我们团队从布拉格的CTU参与期间使用的,该系统在享有声望的MBZIRC 2017和2020 Robotics竞赛中,还参加了DARPA SubT挑战赛。每次,我们的团队都能在世界各地最好的竞争对手中获得最高位置。在每种情况下,挑战都促使团队改善系统,并在紧迫的期限内获得大量高质量的体验。
translated by 谷歌翻译
Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. A subset of such approaches of addressing the COD has led us to solving high-dimensional PDEs. This has resulted in opening doors to solving a variety of real-world problems ranging from mathematical finance to stochastic control for industrial applications. Although feasible, these deep learning methods are still constrained by training time and memory. Tackling these shortcomings, Tensor Neural Networks (TNN) demonstrate that they can provide significant parameter savings while attaining the same accuracy as compared to the classical Dense Neural Network (DNN). In addition, we also show how TNN can be trained faster than DNN for the same accuracy. Besides TNN, we also introduce Tensor Network Initializer (TNN Init), a weight initialization scheme that leads to faster convergence with smaller variance for an equivalent parameter count as compared to a DNN. We benchmark TNN and TNN Init by applying them to solve the parabolic PDE associated with the Heston model, which is widely used in financial pricing theory.
translated by 谷歌翻译
In the future, service robots are expected to be able to operate autonomously for long periods of time without human intervention. Many work striving for this goal have been emerging with the development of robotics, both hardware and software. Today we believe that an important underpinning of long-term robot autonomy is the ability of robots to learn on site and on-the-fly, especially when they are deployed in changing environments or need to traverse different environments. In this paper, we examine the problem of long-term autonomy from the perspective of robot learning, especially in an online way, and discuss in tandem its premise "data" and the subsequent "deployment".
translated by 谷歌翻译
Factorization machines (FMs) are a powerful tool for regression and classification in the context of sparse observations, that has been successfully applied to collaborative filtering, especially when side information over users or items is available. Bayesian formulations of FMs have been proposed to provide confidence intervals over the predictions made by the model, however they usually involve Markov-chain Monte Carlo methods that require many samples to provide accurate predictions, resulting in slow training in the context of large-scale data. In this paper, we propose a variational formulation of factorization machines that allows us to derive a simple objective that can be easily optimized using standard mini-batch stochastic gradient descent, making it amenable to large-scale data. Our algorithm learns an approximate posterior distribution over the user and item parameters, which leads to confidence intervals over the predictions. We show, using several datasets, that it has comparable or better performance than existing methods in terms of prediction accuracy, and provide some applications in active learning strategies, e.g., preference elicitation techniques.
translated by 谷歌翻译
By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse, robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks. The project's website and videos can be found at robotics-transformer.github.io
translated by 谷歌翻译
This case study investigates the extent to which a language model (GPT-2) is able to capture native speakers' intuitions about implicit causality in a sentence completion task. We first reproduce earlier results (showing lower surprisal values for pronouns that are congruent with either the subject or object, depending on which one corresponds to the implicit causality bias of the verb), and then examine the effects of gender and verb frequency on model performance. Our second study examines the reasoning ability of GPT-2: is the model able to produce more sensible motivations for why the subject VERBed the object if the verbs have stronger causality biases? We also developed a methodology to avoid human raters being biased by obscenities and disfluencies generated by the model.
translated by 谷歌翻译
Semi-supervised anomaly detection is a common problem, as often the datasets containing anomalies are partially labeled. We propose a canonical framework: Semi-supervised Pseudo-labeler Anomaly Detection with Ensembling (SPADE) that isn't limited by the assumption that labeled and unlabeled data come from the same distribution. Indeed, the assumption is often violated in many applications - for example, the labeled data may contain only anomalies unlike unlabeled data, or unlabeled data may contain different types of anomalies, or labeled data may contain only 'easy-to-label' samples. SPADE utilizes an ensemble of one class classifiers as the pseudo-labeler to improve the robustness of pseudo-labeling with distribution mismatch. Partial matching is proposed to automatically select the critical hyper-parameters for pseudo-labeling without validation data, which is crucial with limited labeled data. SPADE shows state-of-the-art semi-supervised anomaly detection performance across a wide range of scenarios with distribution mismatch in both tabular and image domains. In some common real-world settings such as model facing new types of unlabeled anomalies, SPADE outperforms the state-of-the-art alternatives by 5% AUC in average.
translated by 谷歌翻译
We propose a method for in-hand 3D scanning of an unknown object from a sequence of color images. We cast the problem as reconstructing the object surface from un-posed multi-view images and rely on a neural implicit surface representation that captures both the geometry and the appearance of the object. By contrast with most NeRF-based methods, we do not assume that the camera-object relative poses are known and instead simultaneously optimize both the object shape and the pose trajectory. As global optimization over all the shape and pose parameters is prone to fail without coarse-level initialization of the poses, we propose an incremental approach which starts by splitting the sequence into carefully selected overlapping segments within which the optimization is likely to succeed. We incrementally reconstruct the object shape and track the object poses independently within each segment, and later merge all the segments by aligning poses estimated at the overlapping frames. Finally, we perform a global optimization over all the aligned segments to achieve full reconstruction. We experimentally show that the proposed method is able to reconstruct the shape and color of both textured and challenging texture-less objects, outperforms classical methods that rely only on appearance features, and its performance is close to recent methods that assume known camera poses.
translated by 谷歌翻译
Adversarial attacks in NLP challenge the way we look at language models. The goal of this kind of adversarial attack is to modify the input text to fool a classifier while maintaining the original meaning of the text. Although most existing adversarial attacks claim to fulfill the constraint of semantics preservation, careful scrutiny shows otherwise. We show that the problem lies in the text encoders used to determine the similarity of adversarial examples, specifically in the way they are trained. Unsupervised training methods make these encoders more susceptible to problems with antonym recognition. To overcome this, we introduce a simple, fully supervised sentence embedding technique called Semantics-Preserving-Encoder (SPE). The results show that our solution minimizes the variation in the meaning of the adversarial examples generated. It also significantly improves the overall quality of adversarial examples, as confirmed by human evaluators. Furthermore, it can be used as a component in any existing attack to speed up its execution while maintaining similar attack success.
translated by 谷歌翻译